Summary

Small area estimation has received enormous attention in recent years due to its wide range
of applications, particularly in policy-making decisions. The variance of a direct small area
estimator, which is based on the area-level sample size, is unduly large, and there is a need to
construct model-based estimators with low mean squared prediction error (MSPE). Estimation of
the MSPE, and in particular the bias correction of the estimated MSPE, is a central piece of
small area estimation research. In this article, a new technique of bias correction for the
estimated MSPE is proposed. It is shown that the new MSPE estimator attains the same level of
bias correction as the existing estimators based on straight Taylor expansion and jackknife
methods. However, unlike the existing methods, the proposed estimate of the MSPE is always
nonnegative. Furthermore, the proposed method can be used for general two-level small area
models where the variables at each level can be discrete or continuous and, in particular, be
nonnormal.
1 Introduction

Small area estimation is an important statistical research area due to its growing demand from
public and private agencies. The variance of a direct small area estimator is unduly large due to
the smallness of the area-level sample size. The use of models has proven unavoidable for
controlling the mean squared prediction error (MSPE) of a small area predictor. The bias
correction of the estimated MSPE is a central piece of small area estimation research. See Rao
(2003) and references therein for a detailed discussion.

The standard small area models are usually two-level models, where one level is a sampling model
and the other is a population model. Prasad and Rao (1990) assumed normality at both
levels and used ANOVA estimates of the model parameters to derive second order correct MSPE
estimates. Lahiri and Rao (1995) relaxed the normality assumption at the population level
and re-established the Prasad-Rao result on second order correct MSPE estimation. Datta and
Lahiri (2000) investigated properties of Prasad-Rao (PR) type MSPE estimators for maximum
likelihood and restricted maximum likelihood estimates of the model parameters, retaining the
normal distribution assumption at both levels. Recently, Jiang, Lahiri and Wan (2002)
proposed jackknife-based MSPE estimators for which normality is not a requirement. However,
the Jiang-Lahiri-Wan (JLW) method requires a closed form expression for the posterior risk,
which is often not available (e.g., in the binomial-normal model). Moreover, the JLW estimator has
the undesirable property that it may produce negative MSPE estimates (Bell, 2002). Although
the PR type MSPE estimates are nonnegative for the normal-normal case, the nonnegativity
property is unknown for other situations. The PR type MSPE estimators correct the bias of
the estimated MSPE using Taylor's expansion. On the other hand, the JLW MSPE estimator
corrects the bias of the posterior risk using the jackknife method. In this article, we propose a
new technique of MSPE bias correction which attains the same level of accuracy as that of the
PR type or the JLW MSPE estimators. In addition, the new MSPE estimates are guaranteed to
be nonnegative. Moreover, the new method is valid for any family of parametric distributions,
discrete or continuous. Thus, unlike the traditional methods, neither the normality assumption
nor the choice of a specific parameter estimation method is required for the validity of the
proposed approach.

The organization of the paper is as follows. The next section introduces the two-level
small area models and discusses the existing MSPE estimation methods in this framework.
Section 3 proposes the new MSPE estimator. Some technical properties of the proposed
estimator are discussed and compared with the existing methods in Section 4. Section 5 reports
finite sample properties of the new estimator using a simulation study. Some conclusions and
comments are made in Section 6. Proofs of the technical results are given in the Appendix.

2 Existing Methods of MSPE Estimation 

Consider the two-level small area model

$y_i = \theta_i + e_i, \quad e_i \sim F(\cdot; D_i)$    (2.1)

$\theta_i = x_i^T \beta + u_i, \quad u_i \sim G(\cdot; \gamma)$    (2.2)

$i = 1, \ldots, m$, where $y_1, \ldots, y_m$ are direct estimators with sampling errors
$e_1, \ldots, e_m$, independently distributed with cumulative distribution functions
$F(\cdot; D_1), \ldots, F(\cdot; D_m)$, respectively; $u_1, \ldots, u_m$ are independent and
identically distributed (iid) random variables with common distribution function
$G(\cdot; \gamma)$, and $x_1, \ldots, x_m$ are $p$-dimensional nonrandom covariates. We suppose
that the unknown parameters of the model are given by the regression parameter $\beta$ and the
$p$-dimensional parameter $\gamma$ of the random effects distribution $G(\cdot; \gamma)$ in (2.2),
but the values of $D_1, \ldots, D_m$ are known, as typically assumed in two-level small area
models. We also assume that the sampling errors $e_1, \ldots, e_m$ and random effects
$u_1, \ldots, u_m$ are mutually independent with $E(e_i) = 0 = E(u_i)$, $i = 1, \ldots, m$. Note
that neither the $e_i$'s nor the $u_i$'s are required to be normally distributed. In fact, under
(2.1) and (2.2), the $e_i$'s and the $u_i$'s are allowed to have arbitrary parametric families of
discrete or continuous distributions.
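
As an illustration of how general the model (2.1) and (2.2) is, the following minimal sketch
draws data from it with nonnormal components. Everything specific here is an assumption made
for illustration only: the function name `simulate_two_level`, the centred exponential random
effects, the uniform sampling errors, and all numerical values.

```python
import random

def simulate_two_level(x, beta, D, draw_u, draw_e, rng):
    """Draw (y, theta) from the two-level model (2.1)-(2.2).

    x      : list of covariate vectors x_i (each of length p)
    beta   : regression coefficients (length p)
    D      : known sampling-level parameters D_i
    draw_u : callable rng -> mean-zero random effect u_i (plays the role of G)
    draw_e : callable (rng, D_i) -> mean-zero sampling error e_i (plays the role of F)
    """
    theta, y = [], []
    for x_i, D_i in zip(x, D):
        xb = sum(a * b for a, b in zip(x_i, beta))
        th = xb + draw_u(rng)             # population model (2.2)
        theta.append(th)
        y.append(th + draw_e(rng, D_i))   # sampling model (2.1)
    return y, theta

# Hypothetical nonnormal choices: centred exponential random effects with
# variance gamma, and uniform sampling errors with variance D_i.
gamma = 2.0

def draw_u(rng):
    scale = gamma ** 0.5
    return rng.expovariate(1.0 / scale) - scale   # mean 0, variance gamma

def draw_e(rng, D_i):
    half_width = (3.0 * D_i) ** 0.5               # Uniform(-w, w) has variance w^2 / 3
    return rng.uniform(-half_width, half_width)

rng = random.Random(0)
m = 5000
x = [(1.0, rng.random()) for _ in range(m)]       # intercept plus one covariate
D = [0.5 + (i % 3) for i in range(m)]             # illustrative known D_i values
y, theta = simulate_two_level(x, (1.0, 2.0), D, draw_u, draw_e, rng)
```

Only the two `draw_*` callables need to change to switch between discrete and continuous
families, which mirrors the fact that the model places no normality requirement on either level.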
Suppose that the quantity of interest for prediction is given by

$h(\theta_i), \quad i = 1, \ldots, m$    (2.3)

for some smooth function $h : \mathbb{R} \to \mathbb{R}$. For example, $h(x) = x$ is the most
commonly used function, which may correspond to area-level means or totals. An important example
of $h(\cdot)$ is exponentiation, as in the U.S. Census Bureau's ongoing Small Area Income and
Poverty Estimation (SAIPE) project. For county-level poverty estimation in SAIPE, the model
(2.1) and (2.2) applies after log transformation of the original data.



The best predictor (BP) of $h(\theta_i)$ is given by

$H_i(\delta) = E_\delta( h(\theta_i) \mid y_i ), \quad i = 1, \ldots, m$    (2.4)

where $\delta = (\beta^T, \gamma^T)^T$ is the vector of model parameters. Since the true value of
$\delta$ is unknown, $H_i(\delta)$ is not directly usable in practice. It is customary to
substitute an estimator $\hat\delta$, say, of $\delta$ and predict $h(\theta_i)$ by using the
estimated best predictor (EBP)

$H_i(\hat\delta), \quad i = 1, \ldots, m.$    (2.5)

Performance of the EBP is measured by the mean squared prediction error (MSPE):

$M_i(\delta) = E_\delta [ H_i(\hat\delta) - h(\theta_i) ]^2, \quad i = 1, \ldots, m.$    (2.6)

Further, like the EBP, an estimator of the MSPE is obtained as $M_i(\hat\delta)$,
$i = 1, \ldots, m$. However, as pointed out by Prasad and Rao (1990) in their seminal paper, this
naive plug-in estimator is not very useful. To appreciate why, note that $M_i(\delta)$ can be
decomposed as

$M_i(\delta) = E_\delta [ H_i(\hat\delta) - h(\theta_i) ]^2$
$\quad = E_\delta [ H_i(\delta) - h(\theta_i) ]^2 + E_\delta [ H_i(\hat\delta) - H_i(\delta) ]^2$
$\quad = M_{1i}(\delta) + M_{2i}(\delta), \quad i = 1, \ldots, m,$    (2.7)

where the cross-product term vanishes as a consequence of the fact that
$E_\delta \{ (H_i(\delta) - h(\theta_i)) Z \} = 0$ for any $\sigma(y_1, \ldots, y_m)$-measurable
random variable $Z$. In (2.7), the first term $M_{1i}(\delta)$ is the optimal prediction error
using the unknown ideal predictor $H_i(\delta)$ and is of order $O(1)$ as $m \to \infty$. The
second term $M_{2i}(\delta)$ arises from the error in estimating the unknown model parameters
$\delta$ in the BP $H_i(\delta)$ and, typically, it is of the order $O(m^{-1})$ as
$m \to \infty$. Prasad and Rao (1990) showed that by substituting $\hat\delta$ for $\delta$ to
define the naive plug-in estimator

$M_i(\hat\delta) = M_{1i}(\hat\delta) + M_{2i}(\hat\delta),$

one introduces an additional bias of the order $O(m^{-1})$, which is of the same order as the
second term $M_{2i}(\delta)$ in (2.7). As a result, the naive estimator has a masking effect on
the bias of the EBP and hence, it is not a good estimator of $M_i(\delta)$, particularly when $m$
is not too large.
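
For a concrete feel for the leading term in (2.7), the sketch below specializes to the
normal-normal case with $h(x) = x$, where the BP and $M_{1i}$ have the familiar closed forms
$H_i(\delta) = x_i^T\beta + B_i (y_i - x_i^T\beta)$ with $B_i = \sigma^2/(\sigma^2 + D_i)$, and
$M_{1i}(\delta) = \sigma^2 D_i/(\sigma^2 + D_i)$. This special case, the function names, and the
numerical values are assumptions chosen for illustration; the general model does not require
normality.

```python
import random

def shrinkage(sigma2, D_i):
    """Shrinkage weight B_i = sigma2 / (sigma2 + D_i) in the normal-normal case."""
    return sigma2 / (sigma2 + D_i)

def best_predictor(y_i, xb_i, sigma2, D_i):
    """BP E(theta_i | y_i) when u_i ~ N(0, sigma2) and e_i ~ N(0, D_i)."""
    return xb_i + shrinkage(sigma2, D_i) * (y_i - xb_i)

def m1(sigma2, D_i):
    """M_1i(delta): MSPE of the BP when delta is known."""
    return sigma2 * D_i / (sigma2 + D_i)

def monte_carlo_m1(xb_i, sigma2, D_i, reps, rng):
    """Monte Carlo estimate of E[H_i(delta) - theta_i]^2 with delta known."""
    total = 0.0
    for _ in range(reps):
        theta = xb_i + rng.gauss(0.0, sigma2 ** 0.5)
        y = theta + rng.gauss(0.0, D_i ** 0.5)
        total += (best_predictor(y, xb_i, sigma2, D_i) - theta) ** 2
    return total / reps

rng = random.Random(1)
sigma2, D_i = 140.0, 147.0 / 3       # illustrative values only
print(m1(sigma2, D_i), monte_carlo_m1(118.0, sigma2, D_i, 200_000, rng))
```

The two printed numbers should agree up to Monte Carlo error. The term $M_{2i}$ is absent here
because $\delta$ is treated as known; it appears only once $\hat\delta$ replaces $\delta$.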

Prasad and Rao (1990) suggested a bias corrected estimator of the MSPE $M_i(\delta)$ for a
normal-normal model. The key idea there is to estimate the (leading term of the) bias of
$M_i(\hat\delta)$ using explicit analytical expressions. The bias corrected estimated MSPE,
proposed by Prasad and Rao (1990), is of the form

$\hat M_i^{PR} = M_{1i}(\hat\delta) - \widehat{\mathrm{Bias}}_i^{PR} + M_{2i}(\hat\delta),$    (2.8)

where $\widehat{\mathrm{Bias}}_i^{PR}$ is obtained by estimating higher order terms in the
Taylor's expansion of the function $M_{1i}(\cdot)$ around $\delta$.

An alternative approach, put forward by Jiang, Lahiri and Wan (2002), involves using the
jackknife method to correct the $O(m^{-1})$-order bias term in the naive estimator
$M_i(\hat\delta)$. More specifically, the bias-corrected estimator of $M_i(\delta)$ of JLW is
given by

$\hat M_i^{JLW} = M_{1i}(\hat\delta) - \widehat{\mathrm{Bias}}_i^{JLW} + M_{2i}(\hat\delta),$    (2.9)

where $\widehat{\mathrm{Bias}}_i^{JLW}$ is the jackknife estimator of the bias of
$M_{1i}(\hat\delta)$.

Although the estimators $\hat M_i^{PR}$ and $\hat M_i^{JLW}$ have superior bias properties, an
undesirable feature of both of these estimators is that they may produce negative MSPE estimates
with positive probability. This results from the sampling variability of the bias estimators,
which may dominate the value of the unadjusted naive estimator $M_i(\hat\delta)$ and thereby may
lead to a negative value of the bias corrected MSPE estimators.

In this article, we propose a different approach to bias correction that is guaranteed to
produce a nonnegative estimate of the MSPE. The key idea here is to suitably tilt the value of
$\hat\delta$, an initial estimator of $\delta$, before evaluating the function $M_i(\cdot)$, such
that the difference between the true MSPE $M_i(\delta)$ and the value of the function
$M_i(\cdot)$ at the new value of the argument, say $\check\delta$, is smaller on average. Since
the MSPE function $M_i(\cdot)$ is always nonnegative, the resulting estimator of the true MSPE is
always nonnegative. The tilted value $\check\delta$ is constructed from $\hat\delta$ using the
data values only and hence it is itself an estimator of $\delta$. In constructing
$\check\delta$, we implicitly correct the bias of $M_{1i}(\hat\delta)$ by making use of estimates
of a linear combination of the bias and the variance of the initial estimator $\hat\delta$. Here,
we employ the bootstrap method (Efron, 1979) to derive the bias and variance of the estimators of
the model parameters, although other methods, such as the jackknife and the delta methods, are
equally applicable. The details of the construction are given in the next section.

3 The Proposed Estimator of the MSPE 

3.1 Motivation 

To motivate the definition of the proposed MSPE estimator, consider a related deterministic
approximation problem, where we wish to approximate the value of a smooth function
$f : \mathbb{R} \to \mathbb{R}$ at a point $a \in \mathbb{R}$ using its values over an interval
$I$ containing $a$. For a given $c \neq 0$, setting $x_m = a + c/\sqrt{m}$, $m \geq 1$, and using
Taylor's expansion, we get

$f(x_m) = f(a) + (x_m - a) f'(a) + \frac{1}{2} (x_m - a)^2 f''(a) + O(m^{-3/2}).$    (3.1)

This suggests that starting with $x_m$, we may now construct a new point $\tilde x_m \in I$ of
the form $\tilde x_m = x_m + c_m$, such that

$f(\tilde x_m) = f(a) + O(m^{-3/2}).$    (3.2)

Indeed, by Taylor's expansion of $f(\tilde x_m)$ around $a$, we have

$f(\tilde x_m) = f(a) + (x_m + c_m - a) f'(a) + \frac{1}{2} (x_m + c_m - a)^2 f''(a) + O(m^{-3/2}),$    (3.3)

which satisfies (3.2) if

$(x_m + c_m - a) f'(a) + \frac{1}{2} (x_m + c_m - a)^2 f''(a) = 0.$    (3.4)

Now equation (3.4) can be solved for Cm (yielding the solution Cm = — 1,(1 — {xm — o)) to find 
the desired point $\tilde x_m$. In deriving the proposed MSPE estimator, we employ an extension
of this simple idea to the function $f(\cdot) = M_{1i}(\cdot)$, which is now a function (of
several real variables) from $\mathbb{R}^k$ to $\mathbb{R}$. The role of the point $x_m$ is
played by an initial estimator $\hat\delta$. Some additional care is needed to ensure that the
analog of the tilted point $\tilde x_m$, now denoted by $\check\delta$, is truly an estimator,
i.e., a function of the data alone that does not involve any parameters (e.g., it may not
involve the point $a$ in $\tilde x_m$, which represents the true parameter value $\delta$ in our
application).

3.2 Definition of the proposed estimator 

Let $\hat\delta$ be a given estimator of $\delta$ and let
$b = b(\delta) = E_\delta(\hat\delta - \delta)$ denote the bias and
$V = V(\delta) = \mathrm{Var}_\delta(\hat\delta)$ denote the variance matrix of $\hat\delta$ at
$\delta$. We shall suppose that some consistent estimators $\hat b$ and $\hat V$ of the bias and
the variance matrix of the initial estimator $\hat\delta$ are available. For example, these may
be generated by a suitable resampling method; see Section 5, where we use a parametric bootstrap
method for this purpose. To define the tilted estimator of $\delta$, we also suppose that for
$i = 1, \ldots, m$,

$\sum_{j=1}^{k} | M_{1i}^{(j)}(\delta) | \neq 0,$    (3.5)

where for a differentiable function $f : \mathbb{R}^k \to \mathbb{R}$, $f^{(j)}$ and $f^{(j,l)}$
denote the first and the second order partial derivatives with respect to the $j$-th co-ordinate
and the $(j,l)$-th co-ordinates, respectively, $j, l = 1, \ldots, k$. Condition (3.5) says that
at least one of the first order partial derivatives of the function $M_{1i}(\cdot)$ is nonzero at
the true value of the parameter $\delta$ for each $i$. For notational simplicity, without loss of
generality, we suppose that $M_{1i}^{(1)}(\delta) \neq 0$. Then, we define



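the preliminary tilted estimator of $\delta$ for the $i$-th small area by

$\tilde\delta_i = \hat\delta - \Big[ \sum_{j=1}^{k} \hat b(j) M_{1i}^{(j)}(\hat\delta) + \frac{1}{2} \sum_{j=1}^{k} \sum_{l=1}^{k} \hat V(j,l) M_{1i}^{(j,l)}(\hat\delta) \Big] \{ M_{1i}^{(1)}(\hat\delta) \}^{-1} e_1,$    (3.6)

where $e_1 = (1, 0, \ldots, 0)^T \in \mathbb{R}^k$, $\hat b(j)$ denotes the $j$-th component of
$\hat b$ and $\hat V(j,l)$ denotes the $(j,l)$-th element of $\hat V$. Thus, the estimator
$\tilde\delta_i$ is obtained from the initial estimator $\hat\delta$ by adding a correction
factor to the first component of $\hat\delta$ only. Note that if instead of
$M_{1i}^{(1)}(\delta)$, a different partial derivative $M_{1i}^{(l)}(\delta)$ were nonzero, then
we would define the preliminary tilted estimator $\tilde\delta_i$ by replacing the factor
$\{ M_{1i}^{(1)}(\hat\delta) \}^{-1} e_1$ in (3.6) with $\{ M_{1i}^{(l)}(\hat\delta) \}^{-1} e_l$,
where the vector $e_l \in \mathbb{R}^k$ has 1 in the $l$-th position and zeros elsewhere,
$1 \le l \le k$.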

Next, let $\Delta$ denote the set of possible values of the parameter $\delta$ under the model
(2.1) and (2.2). Then the tilted estimator of $\delta$ for the $i$-th small area is defined by

$\check\delta_i = \begin{cases} \tilde\delta_i & \text{if } \tilde\delta_i \in \Delta \text{ and } | M_{1i}^{(1)}(\hat\delta) |^{-1} \le (1 + \log m)^2 \\ \hat\delta & \text{otherwise} \end{cases}$    (3.7)

$i = 1, \ldots, m$. Thus, if the preliminary estimator $\tilde\delta_i$ takes values inside the
parameter space $\Delta$ and the value of the partial derivative $M_{1i}^{(1)}(\cdot)$ at
$\hat\delta$ is not too small, the tilted estimator of $\delta$ is given by $\tilde\delta_i$
itself. However, in the event that either $\tilde\delta_i$ falls outside $\Delta$ or
$| M_{1i}^{(1)}(\hat\delta) |$ becomes too small, we replace it with the original estimator
$\hat\delta$. Small values of $| M_{1i}^{(1)}(\hat\delta) |$ make the estimator $\tilde\delta_i$
unstable and hence, it is truncated below. It will be shown in Section 4 that under appropriate
regularity conditions, the probability of getting a preliminary estimator $\tilde\delta_i$
outside $\Delta$, or that of getting a value of $| M_{1i}^{(1)}(\hat\delta) |$ below the
threshold $(1 + \log m)^{-2}$, tends to zero rapidly as $m \to \infty$, uniformly in $i$. As a
consequence, the tilted estimator $\check\delta_i$ coincides with the preliminary tilted
estimator $\tilde\delta_i$ with high probability. The proposed estimator of the MSPE is now
defined as

$\check M_i(\hat\delta) = M_{1i}(\check\delta_i) + M_{2i}(\hat\delta), \quad i = 1, \ldots, m.$    (3.8)

Note that by construction, the MSPE estimator is always nonnegative. In the next section, we
show that under some regularity conditions, it has a bias that is of the order $o(m^{-1})$.
Therefore, the proposed estimator attains the same level of accuracy as the previously proposed
estimators $\hat M_i^{PR}$ and $\hat M_i^{JLW}$, while at the same time guaranteeing
nonnegativity.
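
To see the tilting idea of (3.6) and (3.7) at work, the following minimal sketch treats a scalar
toy problem rather than a small area model. All specifics are assumptions made for illustration:
the function $f(x) = x/(1+x)$ stands in for $M_{1i}$, the initial estimator is the sample
variance of $n$ normal observations (bias 0, variance $2\sigma^4/(n-1)$, estimated here by
plug-in instead of by bootstrap), the parameter space is taken as $(0, \infty)$, and the second
order term carries the usual Taylor factor 1/2.

```python
import math
import random

def f(x):         # stand-in for M_1i; nonnegative for x >= 0
    return x / (1.0 + x)

def f1(x):        # first derivative of f
    return 1.0 / (1.0 + x) ** 2

def f2(x):        # second derivative of f
    return -2.0 / (1.0 + x) ** 3

def tilted_point(d_hat, b_hat, v_hat, m):
    """Scalar analogue of (3.6)-(3.7): shift d_hat so that f at the shifted
    point absorbs the leading bias b * f' + 0.5 * V * f''."""
    slope = f1(d_hat)
    if abs(slope) < (1.0 + math.log(m)) ** -2:   # derivative too small: keep d_hat
        return d_hat
    d_tilde = d_hat - (b_hat * slope + 0.5 * v_hat * f2(d_hat)) / slope
    return d_tilde if d_tilde > 0.0 else d_hat   # fall back if outside (0, inf)

def simulate(n=20, sigma2=1.0, reps=200_000, seed=0):
    """Return (bias of naive f(d_hat), bias of tilted value, smallest tilted value)."""
    rng = random.Random(seed)
    nu = n - 1
    naive = tilted = 0.0
    smallest = float("inf")
    for _ in range(reps):
        # sample variance: sigma2 * chi-square(nu) / nu
        d_hat = sigma2 * rng.gammavariate(nu / 2.0, 2.0) / nu
        v_hat = 2.0 * d_hat ** 2 / nu            # plug-in variance estimate
        est = f(tilted_point(d_hat, 0.0, v_hat, n))
        naive += f(d_hat)
        tilted += est
        smallest = min(smallest, est)
    return naive / reps - f(sigma2), tilted / reps - f(sigma2), smallest

print(simulate())
```

Under these assumptions the Monte Carlo bias of the tilted plug-in value should be visibly
smaller in magnitude than that of the naive plug-in value, while every tilted value remains
nonnegative because $f$ itself is nonnegative on the parameter space.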

4 Theoretical Properties of the Proposed Estimators 

In this section, we describe some theoretical properties of the tilted estimator
$\check\delta_i$ of (3.7) and of the bias corrected MSPE estimator $\check M_i(\hat\delta)$ of
(3.8). For proving the results of this section, we shall assume the following regularity
conditions on the model (2.1) and (2.2).

Condition S:

1. $\delta$, the true value of the parameter, is an interior point of $\Delta$.

2. $M_{1i}$ is twice continuously differentiable on $\Delta$ and there exists a constant
$C_1 \in (0, \infty)$ such that

for all $x \in \Delta$, $j, l = 1, \ldots, k$ and $i = 1, \ldots, m$, $m \ge 1$.

3. (i) $M_{2i}$ is differentiable on $\Delta$.

(ii) There exist $C_2, \epsilon_0 \in (0, \infty)$ and $\eta \in (0, 1]$ such that

$| M_{1i}^{(j,l)}(x) - M_{1i}^{(j,l)}(\delta) | + m | M_{2i}^{(j)}(x) - M_{2i}^{(j)}(\delta) | \le C_2 \| x - \delta \|^{\eta}$

for all $x \in \mathcal{N} \equiv \{ \| x - \delta \| \le \epsilon_0 \}$, for
$j, l = 1, \ldots, k$; $i = 1, \ldots, m$, $m \ge 1$.

(iii) There exist a constant $C_3 \in (0, \infty)$ and a function
$G : \mathbb{R}^k \to [0, \infty)$ with $E G(\hat\delta) < \infty$ such that

$| M_{2i}(x) | \le m^{-1} G(x)$ for all $x \in \Delta$,

and

$| M_{2i}^{(j)}(\delta) | \le C_3 m^{-1},$

for all $j = 1, \ldots, k$; $i = 1, \ldots, m$, $m \ge 1$.

We now briefly comment on the regularity condition S. Condition S requires the functions
$M_{1i}$ and $M_{2i}$ to be smooth, which typically holds under suitable smoothness conditions on
the parametric model (2.1) and (2.2). As mentioned earlier, in most applications the function
$M_{1i}$ is of the order $O(1)$ while $M_{2i}$ is of the order $O(m^{-1})$ as $m \to \infty$.
Condition S requires that the partial derivatives of these functions also have the same orders,
respectively. Condition S.3(ii) is a local Lipschitz condition of order $\eta \in (0, 1]$ on the
partial derivatives of $M_{1i}$ and $M_{2i}$. This condition holds with $\eta = 1$ if $M_{1i}$ is
three-times continuously differentiable and $M_{2i}$ two-times continuously differentiable on a
neighborhood of the true parameter value $\delta$.

Next, suppose that the bias and the variance matrix of the given estimator $\hat\delta$ are of
the form:

$b \equiv E(\hat\delta - \delta) = \frac{a}{m} + o\Big(\frac{1}{m}\Big)$ as $m \to \infty$    (4.1)

$V \equiv \mathrm{Var}(\hat\delta) = \frac{\Sigma}{m} + o\Big(\frac{1}{m}\Big)$ as $m \to \infty$.    (4.2)

Let $\hat a$ and $\hat\Sigma$ be estimators of the parameters $a$ and $\Sigma$ in (4.1) and
(4.2) respectively, such that for some $\eta \in (0, 1]$,

$E \| \hat a - a \|^{1+\eta} = O(1)$ as $m \to \infty$    (4.3)

$E \| \hat\Sigma - \Sigma \|^{1+\eta} = O(1)$ as $m \to \infty$.    (4.4)

Note that in the notation of Section 3, the quantities $\hat b$, $\hat a$, $\hat V$ and
$\hat\Sigma$ are related as $\hat b = m^{-1} \hat a$ and $\hat V = m^{-1} \hat\Sigma$.

With this, we are now ready to state the main results of this section. The first result
shows that the preliminary tilted estimators $\tilde\delta_i$ converge to the true parameter
$\delta$ in probability uniformly in $i = 1, \ldots, m$, and also that the first order partial
derivative $M_{1i}^{(1)}(\hat\delta)$ falls below the given threshold $(1 + \log m)^{-2}$ with
very small probability, uniformly in $i = 1, \ldots, m$.

Theorem 1: Suppose that (4.1)-(4.4) and condition S hold. Then

(i) for any $\epsilon \in (0, \infty)$,

$\max_{1 \le i \le m} P( \| \tilde\delta_i - \delta \| > \epsilon ) = O(m^{-1})$ as $m \to \infty$.

(ii) As $m \to \infty$,

$\max_{1 \le i \le m} P( | M_{1i}^{(1)}(\hat\delta) | < (1 + \log m)^{-2} ) = O(m^{-1}).$

Proof: A proof of the theorem is given in the Appendix.

As a direct consequence of the above result, we get the following.

Theorem 2: Under the conditions of Theorem 1,

$\max_{1 \le i \le m} P( \check\delta_i = \tilde\delta_i ) = 1 - O(m^{-1})$ as $m \to \infty$.

Proof: A proof of the theorem is given in the Appendix.

Theorem 2 shows that uniformly in $i$, the tilted estimator $\check\delta_i$ coincides with the
preliminary tilted estimator $\tilde\delta_i$ with high probability when $m$ is large. Thus, the
typical value of the tilted estimator has a correction term added to the first component of the
given initial estimator $\hat\delta$ (cf. (3.6)). The next result shows that this correction
factor indeed reduces the bias of the proposed MSPE estimator $\check M_i(\hat\delta)$ to order
$o(m^{-1})$, as desired.

Theorem 3: Suppose that (4.1)-(4.4) and condition S hold. Further suppose that

$\{ ( \sqrt{m} \| \hat\delta - \delta \| )^2 \}_{m \ge 1}$    (4.5)

is uniformly integrable. Then

$\max_{1 \le i \le m} | E \check M_i(\hat\delta) - M_i(\delta) | = o(m^{-1})$ as $m \to \infty$.    (4.6)

Proof: A proof of the theorem is given in the Appendix.



5 Simulation Study 

We conduct a small simulation study to check the small-sample performance of our proposed MSPE
estimator and compare it with its competitors. In order to mimic a real life study, we consider
the example in Battese, Harter and Fuller (1988) to estimate the area under corn and soybeans
for twelve counties of north-central Iowa. Originally, Battese et al. (1988) applied a nested
error regression model. For simplicity, we consider here the area-level version of their model,
which we think is adequate for illustration purposes. Let $y_{ij}$ be the area under corn for
the $j$-th segment in the $i$-th county and let $x_i$ be the (population) average number of
pixels classified as corn in the $i$-th county. We consider the area-level model

$\bar y_i = \beta_0 + \beta_1 x_i + u_i + e_i, \quad i = 1, \ldots, m$    (5.1)

where $\bar y_i = n_i^{-1} \sum_{j=1}^{n_i} y_{ij}$ is the sample average area under corn in the
$i$-th county. Here, the $u_i$'s are independently distributed with each following the
$N(0, \sigma_u^2)$ distribution and the $e_i$'s are independent with $e_i \sim N(0, D_i)$ for
$i = 1, \ldots, m$, where $D_i = \sigma_e^2 / n_i$. Further, the $u_i$'s and the $e_i$'s are
independent. In our simulation, we take $\beta_0 = 43.00$, $\beta_1 = 0.25$,
$\sigma_u^2 = 140.00$, $\sigma_e^2 = 147.00$. The $n_i$'s are as given in Battese et al. (1988)
with $\min_{1 \le i \le m} n_i = 1$, $\max_{1 \le i \le m} n_i = 6$ and $m = 12$. For the
simulation study, we generated $R = 20{,}000$ sets of samples using model (5.1) and computed
$\hat\delta = (\hat\beta_0, \hat\beta_1, \hat\sigma_u^2)^T$ each time.

For estimating the bias and the variance of the estimator vector $\hat\delta$ used in the
definition of the tilted estimators $\check\delta_i$, we employed a parametric bootstrap method.
For the sake of completeness, we briefly point out the main steps of the bootstrap procedure here.

• Step (I): Generate independent random variables $\{e_i^*\}_{i=1}^m$ and $\{u_i^*\}_{i=1}^m$
with $e_i^* \sim N(0, D_i)$ and $u_i^* \sim N(0, \hat\sigma_u^2)$.

• Step (II): Define the bootstrap variables $y_i^* = \theta_i^* + e_i^*$;
$\theta_i^* = x_i^T \hat\beta + u_i^*$; $i = 1, \ldots, m$.

• Step (III): Define the bootstrap version $\delta^*$ of $\hat\delta$ by replacing
$y_1, \ldots, y_m$ in $\hat\delta$ with $y_1^*, \ldots, y_m^*$.

The bootstrap estimators of the bias and the variance matrix of $\hat\delta$ are now given by

$\hat b = E_* \delta^* - \hat\delta$    (5.2)

$\hat V = E_* (\delta^* - E_* \delta^*)(\delta^* - E_* \delta^*)^T$    (5.3)

where $E_*$ denotes the conditional expectation given the data. In the simulation, Steps
(I)-(III) are repeated a large number of times; the average of the bootstrap versions
$\delta^*$ gives the Monte Carlo approximation to $E_* \delta^*$, while the sample covariance
matrix of the $\delta^*$'s gives the numerical value of the right side of (5.3).
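
Steps (I)-(III) and the Monte Carlo approximations to (5.2) and (5.3) can be sketched as below.
The text does not spell out which estimator $\hat\delta$ is used, so the function `fit` is a
hypothetical choice: a moment estimate of $\sigma_u^2$ from ordinary least squares residuals,
followed by weighted least squares for $(\beta_0, \beta_1)$. The covariate values, the $n_i$
values and the number of bootstrap replicates are likewise illustrative assumptions, not the
values used in the study.

```python
import random

def fit(y, x, D):
    """Hypothetical estimator of delta = (beta0, beta1, sigma_u^2): OLS residuals
    give a moment estimate of sigma_u^2, then beta is re-estimated by WLS."""
    m = len(y)
    xbar, ybar = sum(x) / m, sum(y) / m
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    lev = [1.0 / m + (xi - xbar) ** 2 / sxx for xi in x]        # OLS leverages
    s2u = max(0.0, (rss - sum(d * (1 - h) for d, h in zip(D, lev))) / (m - 2))
    w = [1.0 / (s2u + d) for d in D]                            # WLS weights
    sw = sum(w)
    xw = sum(wi * xi for wi, xi in zip(w, x)) / sw
    yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b1 = (sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y))
          / sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x)))
    return [yw - b1 * xw, b1, s2u]

def bootstrap_bias_var(delta_hat, x, D, B, rng):
    """Monte Carlo approximation to b_hat in (5.2) and V_hat in (5.3)."""
    b0, b1, s2u = delta_hat
    draws = []
    for _ in range(B):
        # Steps (I)-(II): y*_i = x_i' beta_hat + u*_i + e*_i
        y_star = [b0 + b1 * xi + rng.gauss(0.0, s2u ** 0.5) + rng.gauss(0.0, d ** 0.5)
                  for xi, d in zip(x, D)]
        draws.append(fit(y_star, x, D))                         # Step (III)
    k = len(delta_hat)
    mean = [sum(dr[j] for dr in draws) / B for j in range(k)]
    b_hat = [mean[j] - delta_hat[j] for j in range(k)]
    # covariance with divisor B (a divisor of B - 1 would serve equally well)
    V_hat = [[sum((dr[j] - mean[j]) * (dr[l] - mean[l]) for dr in draws) / B
              for l in range(k)] for j in range(k)]
    return b_hat, V_hat

rng = random.Random(0)
n = [1, 1, 1, 2, 3, 3, 3, 3, 4, 5, 5, 6]      # illustrative: m = 12, min 1, max 6
x = [255.0 + 6.0 * i for i in range(12)]      # hypothetical covariate values
D = [147.0 / ni for ni in n]
y = [43.0 + 0.25 * xi + rng.gauss(0.0, 140.0 ** 0.5) + rng.gauss(0.0, d ** 0.5)
     for xi, d in zip(x, D)]
delta_hat = fit(y, x, D)
b_hat, V_hat = bootstrap_bias_var(delta_hat, x, D, B=2000, rng=rng)
```

The returned `b_hat` and `V_hat` are the inputs that the preliminary tilted estimator in (3.6)
needs. Any other estimator of $\delta$ could replace `fit` without changing the bootstrap loop.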

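Next, for each of the three MSPE estimators (namely, the Prasad-Rao estimator $\hat M_i^{PR}$,
the Jiang, Lahiri and Wan estimator $\hat M_i^{JLW}$, and the proposed estimator
$\check M_i(\hat\delta)$) of the small area parameter $\theta_i$, we calculate the following
measures:

• Relative bias with respect to the empirical MSPE:

$RB_i = \dfrac{ E\{ \widehat{MSPE}(\hat\theta_i) \} - SMSPE(\hat\theta_i) }{ SMSPE(\hat\theta_i) }, \quad i = 1, \ldots, 12,$

where $E\{ \widehat{MSPE}(\hat\theta_i) \} = R^{-1} \sum_{r=1}^{R} \widehat{MSPE}(\hat\theta_i)^{(r)}$.

• Empirical coefficient of variation:

$CV_i = \dfrac{ [ E\{ \widehat{MSPE}(\hat\theta_i) - SMSPE(\hat\theta_i) \}^2 ]^{1/2} }{ SMSPE(\hat\theta_i) }, \quad i = 1, \ldots, 12,$

where $E\{ \widehat{MSPE}(\hat\theta_i) - SMSPE(\hat\theta_i) \}^2 = R^{-1} \sum_{r=1}^{R} \{ \widehat{MSPE}(\hat\theta_i)^{(r)} - SMSPE(\hat\theta_i) \}^2$.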
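
The two performance measures above reduce to a few lines of arithmetic over the $R$ replicated
MSPE estimates for a given area. The sketch below is illustrative only: the function names are
hypothetical, and `empirical_mspe` assumes that the empirical MSPE is the average squared
prediction error over the simulation replications.

```python
def empirical_mspe(predictions, truths):
    """Assumed form of SMSPE: average squared prediction error over replications."""
    return sum((p - t) ** 2 for p, t in zip(predictions, truths)) / len(truths)

def relative_bias(mspe_estimates, smspe):
    """RB_i: (average of the replicated MSPE estimates - SMSPE) / SMSPE."""
    return (sum(mspe_estimates) / len(mspe_estimates) - smspe) / smspe

def empirical_cv(mspe_estimates, smspe):
    """CV_i: root mean squared deviation of the MSPE estimates from SMSPE, over SMSPE."""
    mse = sum((e - smspe) ** 2 for e in mspe_estimates) / len(mspe_estimates)
    return mse ** 0.5 / smspe
```

For example, replicated estimates `[1.0, 3.0]` against an empirical MSPE of `2.0` give a
relative bias of 0 and an empirical coefficient of variation of 0.5.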

Table 1 reports a summary of the simulation results. The proposed estimator is denoted
as 'New' in the table.

Table 1: Summary of simulation study

(a) Relative Bias

        min     Q1      median  mean    Q3      max
PR      -.164   -.114   -.055   -.061   -.025    .048
JLW     -.210   -.142   -.067   -.095   -.040   -.022
New     -.163   -.113   -.054   -.060   -.024    .048

(b) Empirical CV

        min     Q1      median  mean    Q3      max
PR      .010    .033    .055    .074    .114    .164
JLW     .082    .094    .120    .132    .151    .212
New     .009    .034    .054    .074    .113    .163

From the above table, it is clear that the proposed estimator and the PR estimator perform
on par and both perform better than the jackknife-based estimator, particularly in terms of the
coefficient of variation. We should also mention that, in this simulation study, the
jackknife method did not produce any negative MSPE estimates. This is perhaps due to the
fact that the true parameter values are far away from the boundary of the parameter space.

6 Conclusions 

In this paper, we described a new method of bias correction for the naive 'plug-in' estimator of
the MSPE of a function of the small area means $h(\theta_i)$, $i = 1, \ldots, m$. Unlike the
existing methods, which may produce a negative estimate of the MSPE with positive probability,
the estimates of the MSPE produced by the proposed method are always nonnegative. Theoretical
properties of the method are investigated, which in particular show that the resulting estimator
of the MSPE attains the same level of accuracy as the existing methods in correcting the bias of
the naive MSPE estimator. Further, the numerical results presented in the paper show that the
proposed method performs on par with the Prasad-Rao (1990) method, and has a slightly better
performance compared to the estimator based on the jackknife method. A key difference between
the new method and the existing methods is that while the existing methods apply various bias
correction techniques to the MSPE function itself, the new method reduces the bias implicitly
by suitably tilting the value of the argument of the MSPE function.



Appendix 

For a vector $x \in \mathbb{R}^k$, let $x(j)$ denote the $j$th component of $x$,
$j = 1, \ldots, k$. Let $C$, $C(\cdot)$ denote generic positive constants that may depend on the
argument(s) (if any) but not on $i = 1, \ldots, m$ or $m$. Also, unless explicitly specified,
limits in order symbols are taken letting $m \to \infty$.

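Proof of Theorem 1: Since $M_{1i}^{(1)}(\delta) \neq 0$, by condition S.3(ii), there exist
$\epsilon_1, \epsilon_2 \in (0, \epsilon_0)$ such that $| M_{1i}^{(1)}(x) | > \epsilon_2$ for all
$x$ with $\| x - \delta \| \le \epsilon_1$. Hence, again by condition S, there exists a
constant $C = C(\epsilon_2) \in (0, \infty)$ such that on the set
$\{ \| \hat\delta - \delta \| \le \epsilon_1 \}$,

$\| \tilde\delta_i - \hat\delta \| \le C [ \| \hat b \| + \| \hat V \| ]$

uniformly in $i = 1, \ldots, m$, $m \ge 1$. Hence, by Chebychev's inequality, for any
$\epsilon \in (0, \infty)$,

$\max_{i = 1, \ldots, m} P( \| \tilde\delta_i - \delta \| > \epsilon )$
$\quad \le P( \| \hat\delta - \delta \| > \epsilon_1 ) + P( C [ \| \hat b \| + \| \hat V \| ] > \epsilon )$
$\quad \le \epsilon_1^{-2} E \| \hat\delta - \delta \|^2 + \epsilon^{-1} C E [ \| \hat b \| + \| \hat V \| ]$
$\quad = O(m^{-1}).$

This proves part (i). For part (ii), note that

$\max_{i = 1, \ldots, m} P( | M_{1i}^{(1)}(\hat\delta) | < (1 + \log m)^{-2} )$
$\quad \le \max_{i = 1, \ldots, m} P( | M_{1i}^{(1)}(\hat\delta) - M_{1i}^{(1)}(\delta) | > \epsilon_2 / 2 )$
$\quad \le P( \| \hat\delta - \delta \| > \epsilon_1 ) + P( C \| \hat\delta - \delta \|^{\eta} > \epsilon_2 / 2 )$
$\quad = O(m^{-1}).$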



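Proof of Theorem 2: Since $\delta$ is an interior point of $\Delta$, there exists an
$\epsilon_3 \in (0, \epsilon_0)$ such that $\{ x : \| \delta - x \| \le \epsilon_3 \} \subset \Delta$.
Hence, by Theorem 1,

$\max_{i = 1, \ldots, m} P( \check\delta_i \neq \tilde\delta_i )$
$\quad \le \max_{i = 1, \ldots, m} P( \tilde\delta_i \notin \Delta ) + \max_{i = 1, \ldots, m} P( | M_{1i}^{(1)}(\hat\delta) | < (1 + \log m)^{-2} )$
$\quad \le \max_{i = 1, \ldots, m} P( \| \tilde\delta_i - \delta \| > \epsilon_3 ) + O(m^{-1}),$

where the last step follows by an application of Chebychev's inequality as in the proof of
Theorem 1 above. This proves Theorem 2.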

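Proof of Theorem 3: By Taylor's expansion and condition S, on the set
$\{ \hat\delta \in \mathcal{N} \}$,

$M_{2i}(\hat\delta) = M_{2i}(\delta) + (\hat\delta - \delta)^T \nabla M_{2i}(\delta) + R_{1i}$    (A.1)

where $\nabla M_{2i}(\cdot)$ is the $k \times 1$ vector of first order partial derivatives of
$M_{2i}$ and $R_{1i}$ is a remainder term. By condition S, $R_{1i}$ admits the bound

$| R_{1i} | \le \| \hat\delta - \delta \| \, \| \nabla M_{2i}(\delta) - \nabla M_{2i}(\delta^0) \| \le C m^{-1} \| \hat\delta - \delta \|^{1+\eta}$    (A.2)

uniformly in $i = 1, \ldots, m$, $m \ge 1$, where $\delta^0$ is a point on the line joining
$\hat\delta$ and $\delta$, so that $\| \delta^0 - \delta \| \le \| \hat\delta - \delta \|$.
Hence, by (4.1), (4.2), (A.1), (A.2) and the dominated convergence theorem (DCT),

$\max_{1 \le i \le m} | E M_{2i}(\hat\delta) - M_{2i}(\delta) |$
$\quad \le \max_{1 \le i \le m} | E \{ M_{2i}(\hat\delta) - M_{2i}(\delta) \} 1(\hat\delta \in \mathcal{N}) | + \max_{1 \le i \le m} E \{ M_{2i}(\hat\delta) + M_{2i}(\delta) \} 1(\hat\delta \notin \mathcal{N})$
$\quad \le \max_{1 \le i \le m} \big[ \| E (\hat\delta - \delta) 1(\hat\delta \in \mathcal{N}) \| \cdot \| \nabla M_{2i}(\delta) \| + E | R_{1i} | 1(\hat\delta \in \mathcal{N}) \big] + m^{-1} E \{ G(\hat\delta) + G(\delta) \} 1(\hat\delta \notin \mathcal{N})$
$\quad \le \max_{1 \le i \le m} \big[ \{ \| E (\hat\delta - \delta) \| + E \| \hat\delta - \delta \| 1(\hat\delta \notin \mathcal{N}) \} \| \nabla M_{2i}(\delta) \| + C m^{-1} E \| \hat\delta - \delta \|^{1+\eta} \big] + m^{-1} \big[ G(\delta) P(\hat\delta \notin \mathcal{N}) + E G(\hat\delta) 1(\hat\delta \notin \mathcal{N}) \big]$
$\quad \le C \big[ m^{-2} + m^{-1} ( E \| \hat\delta - \delta \|^2 )^{1/2} [ P(\hat\delta \notin \mathcal{N}) ]^{1/2} + m^{-1} ( E \| \hat\delta - \delta \|^2 )^{(1+\eta)/2} \big] + o(m^{-1})$
$\quad = o(m^{-1})$    (A.3)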



as P{6 ^ M) < €q E\\6 — 5|p = 0{m^^). Without loss of generality, suppose that eo (in the 
definition of AA) is small enough so that for some C G (0, cx)), sup{|M|j (a;)!^-*^ : x G N,j^l = 
1, • • • , fc; i = 1, • • • , m} < C. Let 



k k k 

j=i j=i 1=1 

k k k 

U = j;M|^')(5)&(i) + j;j;c(j,/)M(f)(<5)F(j,/), 
j=i i=i 1=1 



i = 1, . . . ,m, where c{j, I) = 1/2 for j ^ I and c{j, I) = 1 for j = I. Then by Taylor's expansion, 
it follows that there exists a constants C G (0, cx)) (not depending on i) such that on the set 



Li = Li + i?2j, say 



(A.4) 
and

Si = S j-^ — ei + RsiEi (A.5) 

for i = 1, • • • , m where maxi<j<m |i?2j| < C\ II^IMI^~<^II + II^~^II'''II^II f and for alH = 1, • • • , m, 

liisil <c{|Li|.||^-d|| + |i?2»|}. (A.6) 

By similar arguments, on the set An = {S £ N} n {Si G A}, we may write 

k k 

j; j;c(j,/)Mjf (z.^, + (1 - u)6){U3) - m)m) - m) 
j=i 1=1 

k k 

j=i 1=1 

i = 1,- ■ ■ ,m;u £ [0, 1], where 

sup max \R4i{u)\ < C \\\5i - df +^ + ||^i - 5\\ ■ \\5 - 5\\ + \\5i - 5\f\ (A.7) 

„6[0,1] l<»<m L J 

for some C G (0, oo). 

On the set A2i = An n {6i G J\f} = {6 £ J\f} n {6i G AA}, by Taylor's expansion, there exists 
a point 5* on the line joining 5i and 6 such that 

Mu{5i)-Mu{8) 

3=1 j=l 1=1 

k / 



j; M^(8) {5{j) - 5{j)) + M«(5) --^ + R,, 

3=1 \ ^li {") / 

k k 

j=l 1=1 

k 
k k



+ E E ^(^■' ^Mf (^) { [Hj) - su)) [Hi) - s{i) 

j=l 1=1 

+Mi]\S)R3^ + Rl 



V{j,l)} 



Hi 



rW 



Qu + M\l>{5)R3, + Rl,, say 



(A.8) 



where R\^ = RAi{u) with the u corresponding to 5*. 

Hence for i = l,--- ,m, with Asi = {\m[1\S)\''^ < (1 + logm)2}, 



iSJj 



Mu{S)-Muid) 

[MuiSi) - Mu{S)]l[ {Si G A} n A3.) + [MuCS) - Mu{S)]l[{S^ ^ A} U A 

[Muis^i] - Mu{s)]{i{A2i) + ^6^eA)- ^A2^)}^A 

+ [Mu{S) - Mu{S)]l[{Si ^ A} U Al>j 
Qu + M2)(5)i23* + Rl] HA2i n ^3^) + ^5», say 
Qii + i?6i,say, 



(A.9) 



where |i?6i| < |^5i| + |^3i + ^li|a(^2i) + |Qii|l(^ii n A§J and 

\R^i\ < \Mu{5i) - M{5)\ ■ \\{5i G A) - \{A2i)\\{A^i] 
+\Mu{8 - Mu{6)\\{{5^ ^ AA} U Al^) 
= i?5ij, say. 



Note that by definition, 



I \{6i G A) - a(A2i) I 

< \{h G A)l(AiJ + l(d, ^ A)IL(A2,) 

< {l(d^AA) + ]L(a, G A\AA)} + 1(0). 



(A.IO) 
Hence, with Al^ = {Si ^ Af} D A^i, 



R51i < \Mu{Si)-Mu{S)\^A3^){HHJ^) + HSi^^\^f)} 

+2 I Mu{S) - Mu{8) I [\{5 iN) + \[{h i N} r\ A3.) + 1(A^,)} 

< C\\~di - 6\\\{A3,)[\{5 iN) + 1(5, G A \ AA)} 
+C\\6-5\\[\{6iM) + \{Ai;) + \{Al,)} 

< C ■ (logmfiWb + \\V\\}{HS i N) + \{Al^} 

+C ■ \\6 - 5\\[\{5 iM) + \{Al^ + \{Al,)]. (A.ll) 



By condition S, there exist C G (0, cxd) and ei G (0, y) such that 



At, c {\\5-8\\>^]yj{\\h-S\\>^] 

C {\\6 - 5\\ > 61} U {(logm)2(||b|| + II V||) > C] 



(A.12) 



and Ag^ C {||5 — <5|| > ei} for ah i = 1, • • • , m, m > 1. Hence, it follows that 



R^u < C-(logm)2{||6|| + ||y||} ]l(||<5-<5||>ei) + ]l [logm]2(||&|| + ||F||)>C 



+C- \\S-S\ 



m\S - S\\ > ei) + a [logm]2(||fo|| + IIFII) > C 



(A.13) 



for all i = 1,- ■ ■ ,m,m > 1. Let Wi = {\\a\\ + ||5]||). Note that by uniform integrability of 
{{^/m\\S - S\\y}rn>i and the fact that E \ Wi 1^+^'= 0(1), 



max E{R^ii 

l<i<m 



< Cm ""^(logTTi)^ 



E\Wi 1^+" j'+'fPdl^-^ll >ei)'+'' +-E| |W^i |i+''{m-^(logm)2}'' 



+C 



1/2 



e^^^ll^ - Sfl{\\S - S\\ > ei) + (e\\S - Sf) i.P{m~\logmf\Wi\ > C)\ 



o{m ) as TTT, ^ cx). 



(A.14) 



This completes the proof of Theorem 3. 